
Conversation

Collaborator

@forsyth2 forsyth2 commented Sep 15, 2025

Summary

Objectives:

  • Update Python support to 3.11+

Select one: This pull request is...

  • a bug fix: increment the patch version
  • a small improvement: increment the minor version
  • a new feature: increment the minor version
  • an incompatible (non-backwards compatible) API change: increment the major version

Small Change

  • To merge, I will use "Squash and merge". That is, this change should be a single commit.
  • Logic: I have visually inspected the entire pull request myself.
  • Pre-commit checks: All the pre-commit checks have passed.

@forsyth2 forsyth2 self-assigned this Sep 15, 2025
@forsyth2 forsyth2 added the DevOps CI/CD, configuration, etc. label Sep 15, 2025
Collaborator Author

@forsyth2 forsyth2 left a comment

@tomvothecoder Can you do a quick review of this please?

My testing steps
cd ~/ez/zstash
git fetch upstream main
git checkout -b update-python upstream/main
# Make changes to support python3.11+
nersc_conda
rm -rf build
conda clean --all --y
conda env create -f conda/dev.yml -n zstash-py-3-11-20250915
conda activate zstash-py-3-11-20250915
pre-commit run --all-files
python -m pip install .
python -m unittest tests/test_*.py
# FAILED (errors=4, skipped=1)
# The Globus test was probably skipped because Globus authentication credentials were missing.
# ERROR: testCheckParallelVerboseMismatch (tests.test_check_parallel.TestCheckParallel.testCheckParallelVerboseMismatch)
# ERROR: testCheckParallelVerboseMismatchHPSS (tests.test_check_parallel.TestCheckParallel.testCheckParallelVerboseMismatchHPSS)
# ERROR: testCheckVerboseMismatch (tests.test_check.TestCheck.testCheckVerboseMismatch)
# ERROR: testCheckVerboseMismatchHPSS (tests.test_check.TestCheck.testCheckVerboseMismatchHPSS)
# FileNotFoundError: [Errno 2] No such file or directory: 'sqlite3'
python -m unittest tests.test_check.TestCheck.testCheckVerboseMismatch
# FileNotFoundError: [Errno 2] No such file or directory: 'sqlite3'
git add -A
git commit -m "Update Python support"
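The `FileNotFoundError: [Errno 2] No such file or directory: 'sqlite3'` above is what Python raises when a subprocess tries to execute a binary that is not on the PATH -- here, presumably the `sqlite3` command-line tool missing from the new conda environment (the eventual fix in this thread is adding `- sqlite` to conda/dev.yml). A minimal sketch of the failure mode, using a deliberately nonexistent binary name:

```python
import subprocess

# Executing a program that is not on the PATH raises FileNotFoundError,
# the same failure mode as a conda environment without the sqlite3 CLI.
try:
    subprocess.run(["no-such-binary-xyz", "--version"], check=True)
except FileNotFoundError as err:
    print(type(err).__name__)  # FileNotFoundError
```

A quick pre-flight check before shelling out is `shutil.which("sqlite3") is not None`.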

git checkout -b test-main upstream/main
rm -rf build
conda clean --all --y
conda env create -f conda/dev.yml -n zstash-py-3-9-20250915
conda activate zstash-py-3-9-20250915
pre-commit run --all-files
python -m pip install .
python -m unittest tests.test_check.TestCheck.testCheckVerboseMismatch # OK
python -m unittest tests.test_check.TestCheck.testCheckVerboseMismatch # OK
python -m unittest tests.test_check_parallel.TestCheckParallel.testCheckParallelVerboseMismatchHPSS # OK
python -m unittest tests.test_check_parallel.TestCheckParallel.testCheckParallelVerboseMismatch # OK

git checkout update-python
git push upstream update-python
# https://github.com/E3SM-Project/zstash/pull/384/files

# Add `- sqlite`
rm -rf build
conda clean --all --y
conda env create -f conda/dev.yml -n zstash-py-3-11-sqlite-20250915
conda activate zstash-py-3-11-sqlite-20250915
pre-commit run --all-files
python -m pip install .
python -m unittest tests.test_check.TestCheck.testCheckVerboseMismatch # OK
python -m unittest tests.test_check.TestCheck.testCheckVerboseMismatchHPSS # OK
python -m unittest tests.test_check_parallel.TestCheckParallel.testCheckParallelVerboseMismatchHPSS # OK
python -m unittest tests.test_check_parallel.TestCheckParallel.testCheckParallelVerboseMismatch # OK
git add -A
git commit -m "Add explicit sqlite dependency"

- pip=22.2.2
- python=3.9.13
- python=3.11
- sqlite
Collaborator Author

@tomvothecoder How do we determine the constraints (=, >=, etc.)?

Collaborator

This is based on the conda-forge recipe and compatibility with other packages. If things break with new package versions, that suggests updating the constraints accordingly.

@forsyth2 forsyth2 marked this pull request as ready for review September 15, 2025 23:50
Collaborator

@tomvothecoder tomvothecoder left a comment

I added review comments. Also make sure to update the setup.py: https://github.com/E3SM-Project/zstash/blob/main/setup.py to Python >=3.11.

At some point, this file should be converted to a pyproject.toml, which is the modern standard.

conda/dev.yml Outdated
# =================
- pip=22.2.2
- python=3.9.13
- python=3.11
Collaborator

The dev.yml constraints should align to the conda-forge recipe. Currently, the constraints are too restrictive with exact versions pinned.

Follow this for more guidance if needed: E3SM-Project/e3sm_diags#1006

@forsyth2
Collaborator Author

Added f680ec5 but now getting AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?...

Testing
nersc_conda
cd ~/ez/zstash
git status
# Check for uncommitted changes
git checkout update-python
git log
# Matches https://github.com/E3SM-Project/zstash/pull/384/commits
# Update Python support, Add explicit sqlite dependency

# Make edits based on:
git grep -n python
git grep -n "3\.9"

# Make sure these are all >=3.11 or =3.13
git grep -n "3\.11"

# Now:
git grep -n "3\.9" # No results
git grep -n "3\.11"
# conda/dev.yml:9:  - python >=3.11,<3.14
# conda/meta.yaml:19:    - python >=3.11,<3.14
# conda/meta.yaml:23:    - python >=3.11,<3.14
# setup.py:10:    python_requires=">=3.11,<3.14",
git grep -n "3\.13"
# .github/workflows/build_workflow.yml:22:      - name: Set up Python 3.13
# .github/workflows/build_workflow.yml:25:          python-version: 3.13
# setup.cfg:49:python_version = 3.13
git grep -n "3\.14"
# conda/dev.yml:9:  - python >=3.11,<3.14
# conda/meta.yaml:19:    - python >=3.11,<3.14
# conda/meta.yaml:23:    - python >=3.11,<3.14
# setup.py:10:    python_requires=">=3.11,<3.14",

rm -rf build
conda clean --all --y
conda env create -f conda/dev.yml -n zstash-py-3-13-sqlite-20250916
conda activate zstash-py-3-13-sqlite-20250916
pre-commit run --all-files
git add -A
python -m pip install .
# AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?
# TODO:
python -m unittest tests/test_*.py

@forsyth2
Collaborator Author

OK, I appear to have everything working now:

python -m ensurepip --upgrade
python -m pip install --upgrade setuptools wheel
git fetch upstream
# Get rid of fair-research-login dependency by using latest `main`:
git rebase upstream/main
rm -rf build
conda clean --all --y
conda env create -f conda/dev.yml -n zstash-20250916v2
conda activate zstash-20250916v2
pre-commit run --all-files
git add -A
python -m pip install .
# Still getting:
# AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?

# Use conda's python to ensure pip in the right environment
python -m ensurepip --upgrade
python -m pip install --upgrade --force-reinstall setuptools wheel pip
python -m pip install .
# Success
python -m unittest tests/test_*.py
# Success
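For context on that error (an explanation, not something stated in the PR itself): `pkgutil.ImpImporter` was removed in Python 3.12, and setuptools releases that predate the removal still reference it during builds, which is why force-reinstalling setuptools/pip/wheel inside the environment fixed `pip install .`. A safe probe that works on any Python version:

```python
import pkgutil
import sys

# pkgutil.ImpImporter was deprecated since Python 3.3 and removed in 3.12;
# getattr with a default probes for it without raising AttributeError.
imp_importer = getattr(pkgutil, "ImpImporter", None)
print(sys.version_info[:2], "ImpImporter present:", imp_importer is not None)
```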

@forsyth2
Collaborator Author

forsyth2 commented Sep 17, 2025

While I got it working for me on Perlmutter, the GitHub Actions were still failing. After substantial iteration with Claude, I arrived at some fixes. This now runs GitHub Actions on Python 3.11 and 3.12.

Steps
# Add Claude's Matrix testing suggestion
pre-commit run --all-files
git add -A
python -m pip install .
# Success
python -m unittest tests/test_*.py
# Success
git commit -m "Add matrix testing"
git push upstream update-python
# Still fails on GitHub Actions

# Additional fixes to the github workflow
pre-commit run --all-files
git add -A
git commit -m "Additional fixes"
git push upstream update-python
# Still fails on GitHub Actions

# More changes
pre-commit run --all-files
git add -A
git commit -m "Fixes 3rd try"
git push upstream update-python
# 3.11 alone works!

# Add back matrix testing -- 3.11, 3.12, 3.13
pre-commit run --all-files
git add -A
python -m pip install .
# Success
python -m unittest tests/test_*.py
# Success
git commit -m "Fixes 4th try"
git push upstream update-python
# 3.11, 3.12 get cancelled. 3.13 errors.

# Remove 3.13
pre-commit run --all-files
git add -A
git commit -m "Fixes 5th try"
git push upstream update-python
# 3.11, 3.12 work!

Collaborator Author

@forsyth2 forsyth2 left a comment

@tomvothecoder Can you please review the latest version of changes? Thanks!

I addressed your comments and made some other substantial changes to get pip install working & GitHub Actions passing.

If this looks good, I can make similar changes for the equivalent PRs at E3SM-Project/zppy#735 & E3SM-Project/zppy-interfaces#33.

conda/dev.yml Outdated
channels:
- conda-forge
- defaults
- nodefaults # This replaces 'defaults' to avoid Anaconda commercial repos
Collaborator Author

In one of the fix attempts, GitHub Actions failed because of some licensing issue, so Claude suggested this fix. Although, looking at the final PR diff, it seems strange that defaults worked fine before...

Collaborator

I just specify conda-forge and remove defaults. I don't know whether nodefaults is needed or not.


[mypy]
python_version = 3.9
python_version = 3.13
Collaborator Author

How do we choose the specific version for this?

Collaborator

Latest stable version of Python.

@forsyth2
Collaborator Author

Re: Python 3.13 support -- the error `python =3.13* is not installable because it conflicts with any installable versions previously reported` appears in https://github.com/E3SM-Project/zstash/actions/runs/17783224150/job/50546110928. The newly added matrix testing only tests 3.11 and 3.12.

I still need to add matrix testing to zppy/zppy-interfaces. Previously, we've just tested on one version.

@tomvothecoder
Collaborator

Re: Python 3.13 support -- the error `python =3.13* is not installable because it conflicts with any installable versions previously reported` appears in E3SM-Project/zstash/actions/runs/17783224150/job/50546110928. The newly added matrix testing only tests 3.11 and 3.12.

As mentioned here, the dev.yml constraints are too restrictive: the static pins in the Base section, as well as the static pins for the QA and docs tooling, are probably outdated. This is most likely producing the incompatibility with Python 3.13.

Suggestions --

  1. Align Base section with conda-forge feedstock recipe (meta.yaml)
  2. Update the static pins for the other sections to the latest versions -- fix any formatting/linting issues that come up (also update https://github.com/E3SM-Project/zstash/blob/update-python/.pre-commit-config.yaml)

Collaborator

@tomvothecoder tomvothecoder left a comment

My quick review. TLDR: Update dev.yml and try with Python 3.13 again.

- pip=22.2.2
- python=3.9.13
- python=3.11
- sqlite
Collaborator

This is based on the conda-forge recipe and compatibility with other packages. If things break with new package versions, that suggests updating the constraints accordingly.


[mypy]
python_version = 3.9
python_version = 3.13
Collaborator

Latest stable version of Python.

conda/dev.yml Outdated
channels:
- conda-forge
- defaults
- nodefaults # This replaces 'defaults' to avoid Anaconda commercial repos
Collaborator

I just specify conda-forge and remove defaults. I don't know whether nodefaults is needed or not.

runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.11, 3.12]
Collaborator

Suggested change
python-version: [3.11, 3.12]
python-version: [3.11, 3.12, 3.13]

uses: actions/checkout@v3

- name: Set up Python 3.9
- name: Set up Python 3.11
Collaborator

Suggested change
- name: Set up Python 3.11
- name: Set up Python 3.13

uses: actions/setup-python@v4
with:
python-version: 3.9
python-version: 3.11
Collaborator

Suggested change
python-version: 3.11
python-version: 3.13

Collaborator Author

@forsyth2 forsyth2 left a comment

@tomvothecoder, thanks for your earlier tips. Can you please review this latest iteration? It appears to be passing all the GitHub Actions (for Python 3.11, 3.12, 3.13) now. Thanks!

Comment on lines 285 to 297

# Add the file to the tar - ONLY change this line for Python 3.13+
if (
    sys.version_info >= (3, 13)
    and (tarinfo.isfile() or tarinfo.islnk())
    and tarinfo.size > 0
):
    # Python 3.13+ requires fileobj for non-empty regular files
    with open(file_name, "rb") as fileobj:
        tar.addfile(tarinfo, fileobj)
else:
    # Original code - unchanged
    tar.addfile(tarinfo)
Collaborator Author

@chengzhuzhang @xylar -- what Python versions do we need to support for the next E3SM Unified? Is anything above 3.10 ok?

From Claude:

Based on my research, this is indeed a breaking change in Python 3.13 (and potentially 3.12+). The issue is that starting in Python 3.13, TarFile.addfile() now requires a fileobj parameter when adding regular files with non-zero size (see "tarfile — Read and write tar archive files" in the Python 3.13 documentation). This change was introduced to fix a security issue and makes the behavior more consistent, but it breaks existing code that called addfile() without providing a file object for non-empty files.

That is, an actual functionality change is needed, not just changes in DevOps. The functionality change in this file (zstash/hpss_utils.py) seems to pass the GitHub Actions tests for 3.11, 3.12, and 3.13 but I'm a little concerned the disparity between 3.11/3.12 and 3.13 code will break zstash backwards compatibility. E.g., is it possible that tars created under 3.12 might not be the same size as those extracted under 3.13? I'm not sure.

So, if we don't need to support Python 3.13 right now, I'd be inclined to leave this actual functionality change to after this upcoming Unified release. Alternatively, we could devote some time to the Unified testing period to create and extract with different Python versions.
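For reference, the version-portable pattern is simply to always pass a file object: `TarFile.addfile` has accepted an optional `fileobj` since long before 3.13 made it mandatory for non-empty regular files. A self-contained round-trip sketch (not zstash code):

```python
import os
import tarfile
import tempfile

# Always passing fileobj to tar.addfile() is valid on every supported
# Python version, and it is required for non-empty regular files on 3.13+.
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "data.txt")
    with open(src, "wb") as f:
        f.write(b"hello zstash")

    tar_path = os.path.join(workdir, "out.tar")
    with tarfile.open(tar_path, "w") as tar:
        tarinfo = tar.gettarinfo(src, arcname="data.txt")
        with open(src, "rb") as fileobj:
            tar.addfile(tarinfo, fileobj)

    # Read the archive back to confirm the round trip is lossless.
    with tarfile.open(tar_path) as tar:
        content = tar.extractfile("data.txt").read()
    print(content)  # b'hello zstash'
```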

Collaborator Author

@andrewdnolan ^ This is what I mentioned in the EZ meeting re: Python 3.13 support

Collaborator

@tomvothecoder tomvothecoder Sep 29, 2025

E.g., is it possible that tars created under 3.12 might not be the same size as those extracted under 3.13? I'm not sure.

Unit tests would help verify this here. If the tests pass between versions, even with logical changes, then you're good to go.

Otherwise a quick sanity test is to write a Python script using both tar functions and comparing the results.

If both tar functions produce the same results, you can remove the old tar functions entirely (if the new ones support Python 3.11-3.12 too). There would be no need to have both around in this case.

Contributor

E.g., is it possible that tars created under 3.12 might not be the same size as those extracted under 3.13? I'm not sure.

No, I do not think there is any way that could happen. tar files are really fundamental to Unix and should always be backwards compatible. This does not have to mean that tar files are the same size as each other with different python versions but I would expect they might be. There could be differences in metadata or file structure that lead to different file sizes but that don't imply incompatibility.

It is very important that zstash supports all the python version that Unified supports. We cannot have any of our core software opting out of some versions and into others.

Contributor

@forsyth2, looking at the docs, https://docs.python.org/3/library/tarfile.html, it seems that tar.addfile() supported the fileobj parameter in python 3.11 (and even earlier) but that it is now required in python 3.13. I think you can just provide it for all python versions and not have the selector here:

Suggested change: replace

    # Add the file to the tar - ONLY change this line for Python 3.13+
    if (
        sys.version_info >= (3, 13)
        and (tarinfo.isfile() or tarinfo.islnk())
        and tarinfo.size > 0
    ):
        # Python 3.13+ requires fileobj for non-empty regular files
        with open(file_name, "rb") as fileobj:
            tar.addfile(tarinfo, fileobj)
    else:
        # Original code - unchanged
        tar.addfile(tarinfo)

with

    with open(file_name, "rb") as fileobj:
        tar.addfile(tarinfo, fileobj)

Contributor

Similarly, presumably different logic should not be needed below for python >=3.13 and python <3.13.

Collaborator

Also for future reference, Python and Python-based packages typically implement FutureWarning and/or DeprecationWarning to warn users ahead of time for these kinds of things.

I am usually able to catch these early by scanning the unit test suite and fixing them subsequently before it's too late. If unit tests are not implemented, then you may notice warnings in the logs during production usage.

The tar module probably raised this in an earlier version of Python.
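One way to catch these early, as suggested above, is to promote deprecation warnings to errors in the test run (pytest supports this via `-W error::DeprecationWarning`). A minimal illustration with a hypothetical deprecated function:

```python
import warnings

def old_api():
    # Hypothetical function standing in for a deprecated stdlib call.
    warnings.warn("old_api is deprecated; use new_api", DeprecationWarning)
    return 42

# Promote DeprecationWarning to an exception so it fails loudly in tests
# instead of scrolling by unnoticed in the logs.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        old_api()
    except DeprecationWarning as warning:
        print("caught:", warning)  # caught: old_api is deprecated; use new_api
```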

Collaborator Author

@tomvothecoder @xylar Thanks for the suggestions. I added a new commit: 884e994. Please review. It is passing all the GitHub Actions tests for 3.11, 3.12, and 3.13.

Unit tests would help verify this here.

The zstash unit test suite could be improved upon.

It's very hard to do true "unit" testing on zstash because almost everything requires I/O (i.e., you can't test abstract functions with no side effects). Lately, I've been more inclined to write tests in bash that simulate real-world workflows with zstash. For example: https://github.com/E3SM-Project/zstash/blob/main/tests/scripts/globus_auth.bash.

The zstash tests are also written to use unittest rather than pytest, so that needs to be modernized. (I recall trying to do so once but running into a roadblock).

So ideally there would both be more testing and more robust testing, yes, but implementing that is infeasible on the eve of release.

E.g., is it possible that tars created under 3.12 might not be the same size as those extracted under 3.13?

The more I think about this, this is a more general concern with zstash. It's quite conceivable someone would try to zstash extract an archive created with zstash create from many versions ago. It would be good to know that will always work. For instance, when we added the tars (in addition to the files) to the database, there had to be thorough backwards compatibility checks.

Perhaps something to add in the suggested/eventual unit test update.

catch these early by scanning the unit test suite
The tar module probably raised this in an earlier version of Python.

I never noticed any such warnings; I have seen warnings in zppy unit tests, however. I wonder if it's because pytest is better about reporting warnings than unittest. Or perhaps it's because, on main, we have python-version: 3.9 in tests/scripts/globus_auth.bash and python=3.9.13 in conda/dev.yml, which is two minor versions behind 3.11 and four behind 3.13.

Collaborator

It's very hard to do true "unit" testing on zstash because almost everything requires I/O (i.e., you can't test abstract functions with no side effects).

I'm not familiar with the functionalities of zstash exactly, so I might be off here, but I'd imagine you can write unit tests that use pytest's temporary directory to write/read files, which you then pass to zstash functions to get a result. Of course this doesn't account for different machines' filesystems and whatnot, but it at least tests the direct functions to ensure they don't unexpectedly break between changes such as swapping tar functions.

Just something to think about for future test suite enhancements.
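A sketch of the suggestion above. `add_and_hash` is a hypothetical stand-in for a zstash helper, not actual zstash code; pytest injects `tmp_path` as a fresh `pathlib.Path` for each test:

```python
import hashlib
import tarfile

# Hypothetical helper standing in for a zstash function: archive a file
# and return its md5 so the round trip can be asserted.
def add_and_hash(src_path, tar_path):
    digest = hashlib.md5(src_path.read_bytes()).hexdigest()
    with tarfile.open(tar_path, "w") as tar:
        tarinfo = tar.gettarinfo(str(src_path), arcname=src_path.name)
        with open(src_path, "rb") as fileobj:
            tar.addfile(tarinfo, fileobj)
    return digest

# pytest injects tmp_path (a pathlib.Path to a fresh temporary directory).
def test_round_trip(tmp_path):
    src = tmp_path / "input.txt"
    src.write_bytes(b"payload")
    md5 = add_and_hash(src, tmp_path / "out.tar")
    with tarfile.open(tmp_path / "out.tar") as tar:
        extracted = tar.extractfile("input.txt").read()
    assert hashlib.md5(extracted).hexdigest() == md5
```

Run with `pytest`; each test gets its own temporary directory, so no cleanup code is needed.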

Collaborator

@tomvothecoder tomvothecoder left a comment

One suggestion for the DevOps. I did not review the logic in hpss_utils.py since you tagged Jill and Xylar for that.

channel-priority: strict
channel-priority: flexible # Changed from strict to flexible
auto-update-conda: true
python-version: "3.11" # Use stable Python version for docs
Collaborator

Suggested change
python-version: "3.11" # Use stable Python version for docs
python-version: "3.13" # Use stable Python version for docs

Comment on lines 322 to 343
while True:
    s: bytes = f.read(BLOCK_SIZE)
    if len(s) > 0:
        # If the block read in is non-empty, write it to fileobj and update the hash
        tar_fileobj.write(s)
        hash_md5.update(s)
    if len(s) < BLOCK_SIZE:
        # If the block read in is smaller than BLOCK_SIZE,
        # then we have reached the end of the file.
        # blocks = how many blocks of tarfile.BLOCKSIZE fit in tarinfo.size
        # remainder = how much more content is required to reach tarinfo.size
        blocks: int
        remainder: int
        blocks, remainder = divmod(tarinfo.size, tarfile.BLOCKSIZE)
        if remainder > 0:
            null_bytes: bytes = tarfile.NUL
            # Write null_bytes to get the last block to tarfile.BLOCKSIZE
            tar_fileobj.write(null_bytes * (tarfile.BLOCKSIZE - remainder))
            blocks += 1
        # Increase the offset by the amount already saved to the tar
        tar.offset += blocks * tarfile.BLOCKSIZE
        break
Collaborator Author

@golaz Can you please review this diff? I can't tell if this tracking of tar.offset is actually needed. From what Claude tells me, tar.addfile(tarinfo, fileobj) should be taking care of everything.

Collaborator Author

I should note the tests pass on GitHub Actions using Python 3.11, 3.12, 3.13.

conda/dev.yml Outdated
Comment on lines 15 to 18
- black >=24.0.0
- flake8 >=7.0.0
- flake8-isort >=6.0.0
- mypy >=1.11.0
Contributor

@tomvothecoder, is it your philosophy that these need to match .pre-commit-config.yaml exactly?

@forsyth2, presumably .pre-commit-config.yaml should also be updated with its autoupdate feature.

Collaborator

@tomvothecoder, is it your philosophy that these need to match .pre-commit-config.yaml exactly?

Correct, these should align exactly with .pre-commit-config.yaml with exact version pinned as you mentioned before.

Contributor

@xylar xylar left a comment

@forsyth2, the tar related code changes look good to me. I think not having separate blocks for different python versions is a good direction to take.

I'm approving the PR on the assumption that you and @tomvothecoder will work out how you want to handle the pre-commit dependencies both in the pre-commit config file and in the conda environment.

Collaborator

@tomvothecoder tomvothecoder left a comment

My review, remaining fixes:

  1. Align QA tooling versions, specifically conda environments with pre-commit using my key points at the bottom of this review.
  2. Finalize the tar related code changes.

After that, it should be good to go.


@forsyth2, the tar related code changes look good to me. I think not having separate blocks for different python versions is a good direction to take.

I agree with @xylar here.

I'm approving the PR on the assumption that you and @tomvothecoder will work out how you want to handle the pre-commit dependencies both in the pre-commit config file and in the conda environment.

@forsyth2 The key points are:

  1. Pin exact versions of QA tools
    • Don’t use version ranges—always specify the exact version.
    • This guarantees consistent behavior between your local development environment and CI.
  2. Use the same versions across all environments
    • QA tools should match exactly between conda environments (from conda-forge) and pre-commit hooks (from PyPI).
    • If pre-commit autoupdate pulls a version that’s newer than what’s available on conda-forge, you should fall back to the latest version on conda-forge instead for both pre-commit and Conda environments.

conda/meta.yaml Outdated
Comment on lines 23 to 26
- python >=3.9
- fair-research-login >=0.2.6,<0.3.0
- globus-sdk >=3.0.0,<4.0.0
- six
- python >=3.11,<3.14
- globus-sdk=3.15.0
- six=1.16.0
- sqlite
Collaborator

@forsyth2 Did you mean to constrain exact versions of globus-sdk and six here?

On a side-note, I don't think this file is used anymore? I'm pretty sure this file was used to host zstash on the e3sm channel, which we no longer do since adopting conda-forge.

I think you can just delete this file (meta.yaml) if that's the case.

@forsyth2
Collaborator Author

forsyth2 commented Oct 2, 2025

@xylar @tomvothecoder I added 2caba4e to address your comments. GitHub Actions are passing for 3.11, 3.12, 3.13.

Correct, these should align exactly with .pre-commit-config.yaml with exact version pinned as you mentioned before.
Pin exact versions of QA tools
Use the same versions across all environments

I made the pre-commit dependencies use the same versions as in E3SM-Project/zppy#735 & E3SM-Project/zppy-interfaces#33

I don't think this file is used anymore

I deleted conda/meta.yaml. I myself was wondering what that was used for.

Finalize the tar related code changes.

This remains the final piece. I imagine it's probably an ok change.

Collaborator Author

@forsyth2 forsyth2 left a comment

@golaz It looks like my original comment got removed because the code changed. Can you please review the diff for zstash/hpss_utils.py? Thanks!

Comment on lines -304 to -317
# If the block read in is smaller than BLOCK_SIZE,
# then we have reached the end of the file.
# blocks = how many blocks of tarfile.BLOCKSIZE fit in tarinfo.size
# remainder = how much more content is required to reach tarinfo.size
blocks: int
remainder: int
blocks, remainder = divmod(tarinfo.size, tarfile.BLOCKSIZE)
if remainder > 0:
null_bytes: bytes = tarfile.NUL
# Write null_bytes to get the last block to tarfile.BLOCKSIZE
fileobj.write(null_bytes * (tarfile.BLOCKSIZE - remainder))
blocks += 1
# Increase the offset by the amount already saved to the tar
tar.offset += blocks * tarfile.BLOCKSIZE
Collaborator Author

It appears we don't actually need to be tracking/updating tar.offset here.

Collaborator Author

@forsyth2 forsyth2 Oct 2, 2025

After following addfile back through the git blame functionality, it appears to be from the very beginning of zstash: https://github.com/E3SM-Project/zstash/pull/2/files#diff-51246e53255db77c9edad496f074aa1bdbf8dbdc11f89a02040115c9ab4fa7f0 has the following.

# Add file to tar archive while computing its hash
# Return file offset (in tar archive), size and md5 hash
def addfile(tar, file):
    offset = tar.offset
    tarinfo = tar.gettarinfo(file)
    tar.addfile(tarinfo)
    if tarinfo.isfile():
        f = open(file, "rb")
        hash_md5 = hashlib.md5()
        while True:
            s = f.read(BLOCK_SIZE)
            if len(s) > 0:
                tar.fileobj.write(s)
                hash_md5.update(s)
            if len(s) < BLOCK_SIZE:
                blocks, remainder = divmod(tarinfo.size, tarfile.BLOCKSIZE)
                if remainder > 0:
                    tar.fileobj.write(tarfile.NUL *
                                      (tarfile.BLOCKSIZE - remainder))
                    blocks += 1
                tar.offset += blocks * tarfile.BLOCKSIZE
                break
        f.close()
        md5 = hash_md5.hexdigest()
    else:
        md5 = None
    size = tarinfo.size
    mtime = datetime.utcfromtimestamp(tarinfo.mtime)
    return offset, size, mtime, md5

So, I just want to make sure the new tar.addfile takes care of the extra stuff in this function.
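One way to check the offset bookkeeping specifically is a standalone experiment (not zstash code): with the new API, addfile writes the header, the data, and the NUL padding, and advances tar.offset itself, which would make the manual `tar.offset += blocks * tarfile.BLOCKSIZE` redundant.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "f.bin")
    with open(src, "wb") as f:
        f.write(b"x" * 700)  # 700 bytes pads to 2 data blocks of 512

    # GNU format keeps the header to a single 512-byte block here
    # (the default PAX format may emit an extra extended-header record).
    with tarfile.open(os.path.join(workdir, "t.tar"), "w",
                      format=tarfile.GNU_FORMAT) as tar:
        offset_before = tar.offset
        tarinfo = tar.gettarinfo(src, arcname="f.bin")
        with open(src, "rb") as fileobj:
            tar.addfile(tarinfo, fileobj)
        # 512-byte header + 1024 bytes of padded data
        print(tar.offset - offset_before)  # 1536
```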

Contributor

My reading of the documentation is that yes, the tar file is updated already with the tar.addfile() call when a fileobj is passed:
https://docs.python.org/3/library/tarfile.html

The advantage of the original implementation was probably that each file was opened only once and added to the archive at the same time its md5 hash was computed. The new code reopens the file to compute the md5 hash, but I think that's fine (and seemingly unavoidable with the new API).
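For reference, a minimal sketch of the two-pass approach being described (illustrative names, not zstash's actual code): the file is handed to tar.addfile() once, then reopened and hashed in chunks.

```python
import hashlib
import tarfile

CHUNK_SIZE = 1024 * 1024  # illustrative read size for hashing


def add_file_two_pass(tar: tarfile.TarFile, file_name: str):
    """Add a regular file to an open TarFile, then reopen it to compute md5."""
    tarinfo = tar.gettarinfo(file_name)
    # Pass 1: tarfile reads the file's data itself from the passed file object.
    with open(file_name, "rb") as f:
        tar.addfile(tarinfo, f)
    # Pass 2: reopen the file and hash it chunk by chunk.
    hash_md5 = hashlib.md5()
    with open(file_name, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            hash_md5.update(chunk)
    return tarinfo.size, hash_md5.hexdigest()
```

The cost is the extra read of each file, which is exactly the overhead being discussed here.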

Copy link
Collaborator

I'm sorry, but I cannot approve this change. The ability to stream data to tar files and compute hashes at the same time was a key functionality of zstash. Computation of md5 hashes is expensive, removing this functionality will have a significant performance impact.

Copy link
Collaborator Author

Claude's feedback:

If the old implementation breaks on Python 3.13 (likely due to changes in how tarfile handles internal file objects), then you're facing a compatibility vs. performance trade-off.

Options to Consider

1. Accept the performance hit (pragmatic choice)

  • The new implementation works on Python 3.13
  • Performance regression may be acceptable depending on:
    • Typical file sizes in your use case
    • Available I/O bandwidth (SSDs make double-reads less painful)
    • Whether this is a bottleneck in practice

2. Implement a hybrid solution (best of both worlds)

import hashlib
import os
from datetime import datetime

def add_file(tar, file_name, follow_symlinks):
    offset = tar.offset
    tarinfo = tar.gettarinfo(file_name)
    
    if tarinfo.islnk():
        tarinfo.size = os.path.getsize(file_name)
    
    md5 = None
    
    # For files/hardlinks, stream data while computing hash
    if (tarinfo.isfile() or tarinfo.islnk()) and tarinfo.size > 0:
        hash_md5 = hashlib.md5()
        
        # Create a wrapper that computes hash while data passes through
        class HashingFileWrapper:
            def __init__(self, fileobj, hasher):
                self.fileobj = fileobj
                self.hasher = hasher
            
            def read(self, size=-1):
                data = self.fileobj.read(size)
                if data:
                    self.hasher.update(data)
                return data
        
        with open(file_name, "rb") as f:
            wrapper = HashingFileWrapper(f, hash_md5)
            tar.addfile(tarinfo, wrapper)
        
        md5 = hash_md5.hexdigest()
    else:
        tar.addfile(tarinfo)
    
    size = tarinfo.size
    mtime = datetime.utcfromtimestamp(tarinfo.mtime)
    return offset, size, mtime, md5

This wrapper approach:

  • ✅ Uses the proper Python 3.13-compatible API
  • ✅ Maintains single-pass streaming
  • ✅ Computes hash during the tar write operation
  • ✅ Clean, maintainable code

3. Version-specific implementations

Use the old code for Python <3.13 and new code for 3.13+, but this adds maintenance burden.

My Recommendation

Try the hybrid solution first (Option 2). It should give you Python 3.13 compatibility while preserving the streaming performance that Reviewer 2 correctly identified as important. If tar.addfile() properly uses the file-like object's read() method, this should work perfectly.

If the wrapper approach doesn't work for some reason, then you'll need to accept the performance regression as the cost of Python 3.13 support, but at least you'll have tried to preserve the optimization.
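If it helps, here is a compact, self-contained rendering of the wrapper idea (names are illustrative, not necessarily what zstash uses) that can be exercised directly:

```python
import hashlib
import tarfile


class HashingFileWrapper:
    """File-like object that updates a hash as tarfile reads through it."""

    def __init__(self, fileobj, hasher):
        self.fileobj = fileobj
        self.hasher = hasher

    def read(self, size=-1):
        data = self.fileobj.read(size)
        if data:
            self.hasher.update(data)
        return data


def add_with_hash(tar: tarfile.TarFile, file_name: str) -> str:
    """Single pass: tarfile pulls data through the wrapper while md5 updates."""
    tarinfo = tar.gettarinfo(file_name)
    hash_md5 = hashlib.md5()
    with open(file_name, "rb") as f:
        tar.addfile(tarinfo, HashingFileWrapper(f, hash_md5))
    return hash_md5.hexdigest()
```

Because tar.addfile() pulls the data through the wrapper's read(), the archive write and the md5 computation happen in a single pass over the file.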

Copy link
Contributor

@golaz, I will veto any solution that does not allow us to advance to the latest stable python (python 3.13), since this is something I have been fighting to have happen in Unified for years and it has finally become possible because of dropping CDAT.

It seems like zstash has not been getting the TLC it has needed over the years to remain compatible with modern python and that's pretty disappointing. I think @forsyth2 is spread too thin and no one else has stepped up. Goodness knows I'm spread too thin.

Hopefully, we can get things there for Unified, since testing should start on Monday. Hopefully, we can also find a longer term solution for zstash to have a "champion".

Copy link
Collaborator

@chengzhuzhang chengzhuzhang Oct 2, 2025

Sorry, I haven't followed this thread too closely. Which Python version is causing the incompatibility with zstash? I think it's a solid achievement that we could update from 3.10 (now end-of-life) to 3.11 (or higher, if our tools are ready) for Unified, given all the work we put in to migrate from CDAT. In this case, it seems zstash isn't ready yet, and we are too time-constrained to update it with thorough testing.

Can we target Python 3.11 or 3.12 (depending on zstash readiness) for this upcoming unified? Both are still widely used, and 3.11 will continue receiving security updates until October 2027. Again, for a tool like E3SM-unified, I believe we should continue prioritizing maximum compatibility.

Copy link
Contributor

AAAAAAaargh!!

Okay, I think it would be important to find out if zstash works with python 3.12 and without requiring these changes. If we only support python 3.11 in zstash and Unified, that's not good. That's the state we just got out of with dropping CDAT.

I also never want this to happen again with a future Unified. All our software must, must, must support the 3 latest stable versions of python at all times. This is just necessary for our ecosystem to function well. Otherwise, the neglected tools really drag down the tools that are trying to take advantage of more modern capabilities.

Copy link
Collaborator Author

I could get behind the concept of option 2.
Hopefully, we can get things there for Unified, since testing should start on Monday.

I've implemented this in the latest commit: cc37237. It required an additional change to handle empty files. It now passes the GitHub Actions for 3.11, 3.12, and 3.13.

I believe this means this PR should be good to merge. @xylar @golaz Can you please re-confirm based on this latest commit?

zstash has not been getting the TLC it has needed over the years to remain compatible with modern python
Hopefully, we can also find a longer term solution for zstash to have a "champion".

Yes, I agree with all this. After the Unified testing period, perhaps it would be useful to have a team discussion about this. The problem is certainly not a lack of ideas re: code maintainability (e.g., removing all those global variables or improving the testing).

Which Python version is causing the incompatibility with zstash?

3.13

find out if zstash works with python 3.12 and without requiring these changes.

It does. I only ran into errors when I started matrix-testing on 3.13.

All our software must, must, must support the 3 latest stable versions of python at all times.

This is good to keep in mind, thanks.

Copy link
Collaborator

Can we target Python 3.11 or 3.12 (depending on zstash readiness) for this upcoming unified? Both are still widely used, and 3.11 will continue receiving security updates until October 2027. Again, for a tool like E3SM-unified, I believe we should continue prioritizing maximum compatibility.

I also never want this to happen again with a future Unified. All our software must, must, must support the 3 latest stable versions of python at all times. This is just necessary for our ecosystem to function well. Otherwise, the neglected tools really drag down the tools that are trying to take advantage of more modern capabilities.
I agree with @chengzhuzhang that maximizing compatibility is essential, and with @xylar that we must support the latest stable Python versions (currently 3.11, 3.12, 3.13). Falling behind with Python support (mainly due to CDAT) means we’re not truly maximizing compatibility and leaves us playing catch-up with the broader ecosystem.

Our sub-dependencies (e.g., numpy) already expect newer Python versions, which forces @xylar into hacks and workarounds to get things working. Major scientific packages generally follow this Python support spec: https://scientific-python.org/specs/spec-0000/. We might consider adopting this more formally as a guideline to ensure our packages are always ready moving forward.

Comment on lines 285 to 289
if (tarinfo.isfile() or tarinfo.islnk()) and tarinfo.size > 0:
    with open(file_name, "rb") as fileobj:
        tar.addfile(tarinfo, fileobj)
else:
    tar.addfile(tarinfo)
Copy link
Collaborator Author

In the newly supported versions of Python, addfile reads the file's data from fileobj and writes it into the archive.

Copy link
Collaborator

@tomvothecoder tomvothecoder left a comment

Changes to the versions of QA packages look good to me.

Copy link
Collaborator

@golaz golaz left a comment

Cannot approve. I don't understand why we would remove a key functionality of zstash with likely severe performance impact simply to update to a new version of Python.

Comment on lines -304 to -317
# If the block read in is smaller than BLOCK_SIZE,
# then we have reached the end of the file.
# blocks = how many blocks of tarfile.BLOCKSIZE fit in tarinfo.size
# remainder = how much more content is required to reach tarinfo.size
blocks: int
remainder: int
blocks, remainder = divmod(tarinfo.size, tarfile.BLOCKSIZE)
if remainder > 0:
    null_bytes: bytes = tarfile.NUL
    # Write null_bytes to get the last block to tarfile.BLOCKSIZE
    fileobj.write(null_bytes * (tarfile.BLOCKSIZE - remainder))
    blocks += 1
# Increase the offset by the amount already saved to the tar
tar.offset += blocks * tarfile.BLOCKSIZE
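The block arithmetic in the removed lines can be sanity-checked in isolation; for example, a 1000-byte payload occupies two 512-byte tar data blocks (a standalone sketch, not zstash code):

```python
import tarfile


def padded_blocks(size: int) -> int:
    """Number of tarfile.BLOCKSIZE (512-byte) blocks a payload of `size` bytes occupies."""
    blocks, remainder = divmod(size, tarfile.BLOCKSIZE)
    if remainder > 0:
        blocks += 1  # a partial final block is padded with NUL bytes
    return blocks
```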
Copy link
Collaborator

I'm sorry, but I cannot approve this change. The ability to stream data to tar files and compute hashes at the same time was a key functionality of zstash. Computation of md5 hashes is expensive, removing this functionality will have a significant performance impact.

@golaz golaz self-requested a review October 3, 2025 20:23
Copy link
Collaborator

@golaz golaz left a comment

We reviewed the code together. The new implementation with the HashingFileWrapper class resolved my earlier concerns about the performance hit. It is also much more elegant than the original implementation.

Thanks @forsyth2 for coming up with this alternate solution so quickly.

@forsyth2 forsyth2 merged commit b3ddfa0 into main Oct 3, 2025
5 checks passed
@forsyth2 forsyth2 deleted the update-python branch October 3, 2025 20:51
@forsyth2
Copy link
Collaborator Author

forsyth2 commented Oct 3, 2025

Thanks @golaz!

@forsyth2
Copy link
Collaborator Author

forsyth2 commented Oct 3, 2025

@xylar @tomvothecoder While the main functionality is ok, the docs workflow turned out to have an issue once merged to main: https://github.com/E3SM-Project/zstash/actions/runs/18233515307/job/51922401478

Claude says the following. It says option 1 is best, but I'm thinking option 2. Do you have a preference?

From Claude:

The error occurs because the imghdr module was removed in Python 3.13. This module was deprecated in Python 3.11 and completely removed in 3.13, but Sphinx's epub3 builder still tries to import it.

Here are your options to fix this:

Option 1: Use Python 3.12 for Documentation (Recommended)

Change the Python version in your publish-docs job from 3.13 to 3.12:

- name: Build Conda Environment
  uses: conda-incubator/setup-miniconda@v3
  with:
    activate-environment: zstash_dev
    miniforge-variant: Miniforge3
    miniforge-version: latest
    environment-file: conda/dev.yml
    channel-priority: flexible
    auto-update-conda: true
    python-version: "3.12"  # Changed from 3.13 to 3.12

Option 2: Upgrade Sphinx

If you're using an older version of Sphinx, upgrade to Sphinx 7.2+ which has Python 3.13 support. Add this step before building docs:

- name: Upgrade Sphinx for Python 3.13
  run: |
    python -m pip install --upgrade "sphinx>=7.2"

Option 3: Disable epub Builder

If you don't need epub output, disable it in your Sphinx conf.py:

# In docs/source/conf.py
extensions = [
    # your other extensions...
]

# Explicitly disable epub builder
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

My recommendation: Go with Option 1 (use Python 3.12 for docs) since it's the simplest fix and documentation building doesn't need to be on the bleeding edge. You can still test your code on Python 3.13 in the build job while using a stable Python version for documentation.

@xylar
Copy link
Contributor

xylar commented Oct 3, 2025

@forsyth2, I think the problem is with:

  # Need to pin docutils because 0.17 has a bug with unordered lists
  # https://github.com/readthedocs/sphinx_rtd_theme/issues/1115
  - docutils >=0.16,<0.17

This is a very old version of docutils and forces you to use a very old version of sphinx that doesn't support python 3.13 properly.

I believe you want:

docutils >=0.17.1

as that is the first version to support python 3.13 (and also avoids the bug in 0.17.0 mentioned in the comment).

@xylar
Copy link
Contributor

xylar commented Oct 3, 2025

For future reference, a good way to avoid a buggy version might have been:

docutils >=0.16,!=0.17.0

But that is rendered moot because we need a newer version than the buggy one in any case.
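Concretely, the pin in conda/dev.yml would then become something like the following (a sketch; the surrounding entries are assumed unchanged):

```yaml
dependencies:
  # 0.17.0 has a bug with unordered lists:
  # https://github.com/readthedocs/sphinx_rtd_theme/issues/1115
  # 0.17.1+ also lifts the old-Sphinx constraint discussed above
  - docutils >=0.17.1
```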

@forsyth2 forsyth2 mentioned this pull request Oct 3, 2025
7 tasks
@forsyth2
Copy link
Collaborator Author

forsyth2 commented Oct 3, 2025

Thanks @xylar, see #388 (review)

@forsyth2 forsyth2 mentioned this pull request Oct 7, 2025
14 tasks